Practical Lab - 2

Data Visualization¶

Welcome to our introductory lab, where we'll explore the basics of creating data visualizations and converting them to HTML, a widely accessible format for online publishing. This lab gives a first look at the end-to-end process of writing and sharing analyses using three Python data visualization packages: Matplotlib, Seaborn, and Plotly.

mermaid
graph LR
    Package[Data Visualization]-->matplotlib[Matplotlib - 2D plotting library];
    Package[Data Visualization]-->seaborn[Seaborn - Statistical data visualization];
    Package[Data Visualization]-->plotly[Plotly - Interactive plotting library];

| Package | Description | Installation |
|---|---|---|
| Matplotlib | Matplotlib is a 2D plotting library for Python. It provides a wide variety of static, animated, and interactive plots. It is widely used for creating publication-quality visualizations. | pip install matplotlib |
| Seaborn | Seaborn is built on top of Matplotlib and provides a high-level interface for drawing attractive and informative statistical graphics. It simplifies the process of creating complex visualizations with less code. | pip install seaborn |
| Plotly | Plotly is an interactive plotting library that allows users to create dynamic and interactive visualizations. It supports a wide range of chart types and can be used for creating dashboards and web-based applications. | pip install plotly |
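
After running the pip commands above, it can be useful to confirm which versions ended up installed before importing anything. The following is a minimal sketch using only the standard library; the helper name `installed_versions` is made up for this illustration and is not part of any of the three packages.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if missing."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            # The package has not been installed in this environment
            found[name] = None
    return found


versions = installed_versions(["matplotlib", "seaborn", "plotly"])
for name, ver in versions.items():
    print(f"{name}: {ver or 'not installed'}")
```

A `None` entry means the corresponding `pip install` step still needs to be run in the active environment.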

Import the packages¶

In [3]:
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns
import plotly.express as px
import plotly

Matplotlib¶

1. Hexbin Plot using Matplotlib¶

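The following Python code draws two hexbin plots of the same bivariate data with Matplotlib. Hexagonal binning groups the 100,000 points into hexagonal cells and colors each cell by the number of points it contains, which stays readable where a scatter plot would overplot. The left panel uses a linear color scale and the right panel a logarithmic one.

Source code reference: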

https://matplotlib.org/stable/gallery/statistics/hexbin_demo.html#sphx-glr-gallery-statistics-hexbin-demo-py

In [4]:
# Fixing random state for reproducibility
np.random.seed(19680801)

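# Draw 100,000 correlated points: y depends linearly on x plus Gaussian noise
n = 100_000
x = np.random.standard_normal(n)
y = 2.0 + 3.0 * x + 4.0 * np.random.standard_normal(n)
# Use the data extremes as axis limits for both panels
xlim = x.min(), x.max()
ylim = y.min(), y.max()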

fig, (ax0, ax1) = plt.subplots(ncols=2, sharey=True, figsize=(9, 4))

hb = ax0.hexbin(x, y, gridsize=50, cmap='inferno')
ax0.set(xlim=xlim, ylim=ylim)
ax0.set_title("Hexagon binning")
cb = fig.colorbar(hb, ax=ax0, label='counts')

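# bins='log' applies a logarithmic color scale; in recent Matplotlib releases
# the colorbar ticks still show raw counts, so the label stays 'counts'
hb = ax1.hexbin(x, y, gridsize=50, bins='log', cmap='inferno')
ax1.set(xlim=xlim, ylim=ylim)
ax1.set_title("With a log color scale")
cb = fig.colorbar(hb, ax=ax1, label='counts')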

plt.show()
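
One way to see why the right-hand panel benefits from a log color scale is to look at how unevenly the points fall into bins. The sketch below is an illustration only: it uses square bins from `np.histogram2d` as a stand-in for Matplotlib's hexagonal cells, and a seeded `np.random.default_rng` generator instead of `np.random.seed`, so its counts will not match the figure exactly.

```python
import numpy as np

# Same model as the cell above, but drawn from a separate seeded generator
rng = np.random.default_rng(19680801)
n = 100_000
x = rng.standard_normal(n)
y = 2.0 + 3.0 * x + 4.0 * rng.standard_normal(n)

# Assumption: a 50x50 grid of square bins approximates the hexagonal grid
counts, xedges, yedges = np.histogram2d(x, y, bins=50)

occupied = counts[counts > 0]
print("points binned:      ", int(counts.sum()))
print("largest bin:        ", int(occupied.max()))
print("median occupied bin:", float(np.median(occupied)))
print("log10 of the range: ", np.log10(occupied.min()), "to", np.log10(occupied.max()))
```

On a linear scale the few dense central bins dominate the colormap, while taking the logarithm spreads the sparse outer bins across more of the color range.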

2. Electrical Dipole Visualization using Matplotlib¶

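The code uses Matplotlib to visualize the potential of an electrical dipole on a triangular mesh. It draws the mesh, iso-contours of the potential interpolated on a refined mesh, and arrows showing the direction of the electric field, computed as the gradient of -V at the mesh nodes.

Source code reference: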

https://matplotlib.org/stable/gallery/images_contours_and_fields/trigradient_demo.html#sphx-glr-gallery-images-contours-and-fields-trigradient-demo-py

In [5]:
from matplotlib.tri import (CubicTriInterpolator, Triangulation,
                            UniformTriRefiner)


# ----------------------------------------------------------------------------
# Electrical potential of a dipole
# ----------------------------------------------------------------------------
def dipole_potential(x, y):
    """The electric dipole potential V, at position *x*, *y*."""
    r_sq = x**2 + y**2
    theta = np.arctan2(y, x)
    z = np.cos(theta)/r_sq
    return (np.max(z) - z) / (np.max(z) - np.min(z))


# ----------------------------------------------------------------------------
# Creating a Triangulation
# ----------------------------------------------------------------------------
# First create the x and y coordinates of the points.
n_angles = 30
n_radii = 10
min_radius = 0.2
radii = np.linspace(min_radius, 0.95, n_radii)

angles = np.linspace(0, 2 * np.pi, n_angles, endpoint=False)
angles = np.repeat(angles[..., np.newaxis], n_radii, axis=1)
angles[:, 1::2] += np.pi / n_angles

x = (radii*np.cos(angles)).flatten()
y = (radii*np.sin(angles)).flatten()
V = dipole_potential(x, y)

# Create the Triangulation; no triangles specified so Delaunay triangulation
# created.
triang = Triangulation(x, y)

# Mask off unwanted triangles.
triang.set_mask(np.hypot(x[triang.triangles].mean(axis=1),
                         y[triang.triangles].mean(axis=1))
                < min_radius)

# ----------------------------------------------------------------------------
# Refine data - interpolates the electrical potential V
# ----------------------------------------------------------------------------
refiner = UniformTriRefiner(triang)
tri_refi, z_test_refi = refiner.refine_field(V, subdiv=3)

# ----------------------------------------------------------------------------
# Computes the electrical field (Ex, Ey) as gradient of electrical potential
# ----------------------------------------------------------------------------
tci = CubicTriInterpolator(triang, -V)
# Gradient requested here at the mesh nodes but could be anywhere else:
(Ex, Ey) = tci.gradient(triang.x, triang.y)
E_norm = np.sqrt(Ex**2 + Ey**2)

# ----------------------------------------------------------------------------
# Plot the triangulation, the potential iso-contours and the vector field
# ----------------------------------------------------------------------------
fig, ax = plt.subplots()
ax.set_aspect('equal')
# Enforce the margins, and enlarge them to give room for the vectors.
ax.use_sticky_edges = False
ax.margins(0.07)

ax.triplot(triang, color='0.8')

levels = np.arange(0., 1., 0.01)
ax.tricontour(tri_refi, z_test_refi, levels=levels, cmap='hot',
              linewidths=[2.0, 1.0, 1.0, 1.0])
# Plots direction of the electrical vector field
ax.quiver(triang.x, triang.y, Ex/E_norm, Ey/E_norm,
          units='xy', scale=10., zorder=3, color='blue',
          width=0.007, headwidth=3., headlength=4.)

ax.set_title('Gradient plot: an electrical dipole')
plt.show()
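
The relation between field and potential used above, E = -grad V, can be checked numerically without any triangulation. The sketch below is a simplified illustration under stated assumptions: it uses the raw potential cos(theta)/r^2 (without the min/max rescaling applied in `dipole_potential`), a regular grid that stays away from the origin, and `np.gradient` in place of `CubicTriInterpolator`. The function name `raw_dipole_potential` is introduced here for illustration only.

```python
import numpy as np


def raw_dipole_potential(x, y):
    """Unnormalized potential cos(theta) / r**2 (no min/max rescaling)."""
    r_sq = x**2 + y**2
    return np.cos(np.arctan2(y, x)) / r_sq


# Regular grid kept away from the singularity at the origin
xs = np.linspace(1.0, 2.0, 201)
ys = np.linspace(-0.5, 0.5, 201)
X, Y = np.meshgrid(xs, ys, indexing="ij")
V = raw_dipole_potential(X, Y)

# Finite-difference gradient: axis 0 follows x, axis 1 follows y
dV_dx, dV_dy = np.gradient(V, xs, ys)
Ex_num, Ey_num = -dV_dx, -dV_dy

# Closed form: since cos(theta) = x / r, V = x / r**3
r = np.hypot(X, Y)
Ex_exact = -(1.0 / r**3 - 3.0 * X**2 / r**5)
Ey_exact = 3.0 * X * Y / r**5

# Compare on interior points, where np.gradient uses central differences
inner = (slice(1, -1), slice(1, -1))
max_err = max(np.abs(Ex_num - Ex_exact)[inner].max(),
              np.abs(Ey_num - Ey_exact)[inner].max())
print("largest deviation from the closed-form field:", max_err)
```

The deviation shrinks as the grid is refined, which is the same idea the notebook applies on an irregular mesh through cubic interpolation.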

Seaborn¶

1. KDE Plot for Diamond Dataset using Seaborn¶

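The code uses Seaborn to draw a kernel density estimate (KDE) plot of carat in the diamonds dataset, split by cut. With multiple="fill", the per-cut densities are stacked and normalized at each carat value, so the plot shows the share of each cut conditional on carat.

Source code reference: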

https://seaborn.pydata.org/examples/multiple_conditional_kde.html

In [6]:
sns.set_theme(style="whitegrid")

# Load the diamonds dataset
diamonds = sns.load_dataset("diamonds")

# Plot the distribution of cut, conditional on carat
sns.displot(
    data=diamonds,
    x="carat", hue="cut",
    kind="kde", height=6,
    multiple="fill", clip=(0, None),
    palette="ch:rot=-.25,hue=1,light=.75",
)
Out[6]:
<seaborn.axisgrid.FacetGrid at 0x176c3ad5f00>
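
The "fill" normalization can be illustrated with plain NumPy. This is a rough sketch of the idea, not Seaborn's implementation: the two groups are synthetic stand-ins for cut categories, the bandwidth is fixed by hand, and `gaussian_kde_on_grid` is a helper written for this example.

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical stand-in data: two groups with different carat-like distributions
groups = {
    "A": rng.gamma(shape=2.0, scale=0.3, size=600),
    "B": rng.gamma(shape=4.0, scale=0.3, size=400),
}
grid = np.linspace(0.0, 3.0, 200)
bandwidth = 0.1  # fixed kernel width, chosen for illustration only


def gaussian_kde_on_grid(samples, grid, bw):
    """Average of Gaussian kernels centred on each sample, evaluated on grid."""
    z = (grid[:, None] - samples[None, :]) / bw
    return np.exp(-0.5 * z**2).sum(axis=1) / (len(samples) * bw * np.sqrt(2 * np.pi))


total = sum(len(v) for v in groups.values())
# Weight each density by its group's share of all observations
weighted = np.vstack([
    gaussian_kde_on_grid(v, grid, bandwidth) * len(v) / total
    for v in groups.values()
])
# "Fill" step: rescale so the group shares sum to 1 at every grid point
shares = weighted / weighted.sum(axis=0)
print(shares[:, ::50].round(3))
```

Each column of `shares` is the estimated proportion of each group at that x value, which is what the filled bands in the plot display.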

2. Random Walk Visualization using Seaborn¶

The code uses Seaborn to plot 20 short random walks of 5 steps each. A FacetGrid gives each walk its own small Axes, arranged four per row, with a dotted reference line at the starting position and a line plot of the trajectory.

Source code reference:

https://seaborn.pydata.org/examples/many_facets.html

In [7]:
sns.set_theme(style="ticks")

# Create a dataset with many short random walks
rs = np.random.RandomState(4)
pos = rs.randint(-1, 2, (20, 5)).cumsum(axis=1)
pos -= pos[:, 0, np.newaxis]
step = np.tile(range(5), 20)
walk = np.repeat(range(20), 5)
df = pd.DataFrame(np.c_[pos.flat, step, walk],
                  columns=["position", "step", "walk"])

# Initialize a grid of plots with an Axes for each walk
grid = sns.FacetGrid(df, col="walk", hue="walk", palette="tab20c",
                     col_wrap=4, height=1.5)

# Draw a horizontal line to show the starting point
grid.refline(y=0, linestyle=":")

# Draw a line plot to show the trajectory of each random walk
grid.map(plt.plot, "step", "position", marker="o")

# Adjust the tick positions and labels
grid.set(xticks=np.arange(5), yticks=[-3, 3],
         xlim=(-.5, 4.5), ylim=(-3.5, 3.5))

# Adjust the arrangement of the plots
grid.figure.tight_layout(w_pad=1)
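
The data preparation in the cell above is compact, so here it is again in isolation with NumPy only. The sketch assumes the same seed and shapes as the notebook and builds the long-form array with `ravel()` rather than `.flat`; the name `long_form` is chosen for this illustration.

```python
import numpy as np

rs = np.random.RandomState(4)
steps = rs.randint(-1, 2, (20, 5))   # each entry is -1, 0, or 1
pos = steps.cumsum(axis=1)           # running position within each walk
pos -= pos[:, 0, np.newaxis]         # shift so every walk starts at 0

# Long ("tidy") layout: one row per (walk, step) pair
step = np.tile(np.arange(5), 20)     # 0, 1, 2, 3, 4, 0, 1, ...
walk = np.repeat(np.arange(20), 5)   # 0, 0, 0, 0, 0, 1, 1, ...
long_form = np.c_[pos.ravel(), step, walk]

print(long_form.shape)  # (100, 3)
```

FacetGrid expects this layout: the "walk" column decides which facet a row belongs to, while "step" and "position" supply the x and y values.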

Plotly¶

1. Polar Bar Chart for Wind Data using Plotly¶

The following Python code uses Plotly Express to draw a polar bar chart (wind rose) of the built-in wind dataset. Each bar's angle is the wind direction, its length is the frequency, and its color is the wind strength, so the bars for one direction stack by strength.

Source code reference:

https://plotly.com/python/wind-rose-charts/

In [8]:
# Initialize plotly.js in the notebook so figures render without a network connection
plotly.offline.init_notebook_mode()
df = px.data.wind()
fig = px.bar_polar(df, r="frequency", theta="direction", color="strength",
                   template="plotly_dark",
                   color_discrete_sequence=px.colors.sequential.Plasma_r)
fig.show()

2. Parallel Categories Plot for Tips Dataset using Plotly¶

The code uses Plotly Express to draw a parallel categories plot of the "tips" dataset. Each vertical axis is one categorical variable, ribbons connect the category combinations that occur together, and ribbon color encodes the size column (party size).

Source code reference:

https://plotly.com/python/parallel-categories-diagram/

In [9]:
plotly.offline.init_notebook_mode()
df = px.data.tips()
fig = px.parallel_categories(df, color="size", color_continuous_scale=px.colors.sequential.Inferno)
fig.show()